1 Data preparation

1.1 Load packages

  • Load packages: All necessary packages are loaded in from the libraries.r script in the /scripts folder.
  • Author dictionary: An author dictionary is created to manage the presentation of author names in the final tables and meta-analyses. This is particularly for those with punctuation in their names.
  • Load scripts: We have written functions to handle much of the Bayesian analysis
  • Load Bayesian datasets: The Bayesian approach relies on large simulated datasets derived from the observed data, for this report we only present the results graphically from data simulated outside of the report. The scripts bayes_scripts.R, bayesian_data_generation.R and bayesian_analysis.R should be able to be run locally to generate your own simulated data.

The code is all stored in the GitHub repository here

source(here("scripts", "author_dictionary.R"))

1.2 Download data

This is relatively automated in this script however, if you don’t have an API key you can download the file individually and store it in the /data_spreadsheet folder as data_extraction_current.xlsx, the sheets are then extracted from the file.

drive_download("https://docs.google.com/spreadsheets/d/1fmTEW0lHYDk1OE3fw9-mJ6xu9EgBPnYoaL05-OTRiOk/edit?usp=sharing", path = here("data_spreadsheet", "data_extraction_current.xlsx"), overwrite = T)

search_details <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'article_screening')
data_study_general <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'general_details')
table_1 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'pop_descriptives')
table_2 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'testing')
table_3 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'hospitalisation')
table_4 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'severity')
table_5 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'mortality')
table_6 <- readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'updated_quality_appraisal')

We maintain a Google sheets document here which contains all eligible articles the research team have identified. This is currently viewable only and not open for editing. We have collected the general information for the search in the first sheet, the studies general information in the second, further information on study populations, outcomes of interest and quality of the studies are contained in the further sheets. These sheets are organised in a similar way to the manuscript and so are labelled table_1 to table_6.

We have extracted data from 578 studies, however, some of these are duplicates and some have been removed from the analysis set. For example there have been multiple versions of the ISARIC report and several publications have been retracted. We aim to keep on top of this but there is some risk that in between versions data may be superseded.

1.3 Raw Data

The first sheet is Search details we use this sheet to keep track of the number of studies found in our search and keep track of the number of titles and abstracts screened and number of papers included in each review version. It presents the data in groups of search periods.

flextable(search_details %>%
            mutate(`date_screening in format (2020-05-22)` = ymd(`date_screening in format (2020-05-22)`)))

The second sheet is General study details this includes the lead author, where the study is published, the date it was published, the country in which the data originated and the DoI or link to the publication for our records.

This is a searchable list of all the studies that have met out inclusion and exclusion criteria up until -Inf. These are live links and should direct to the relevant article, however, some may have been deleted or require institutional access.

The third sheet produces Table 1 this contains the details of the population of the studies in our review. Primarily includes smoking status but also the proportion of females in the study and the age of participants.

The fourth sheet produces Table 2 this contains the data extracted from relevant studies on testing data and smoking status. Most studies will not have data on this outcome however we filter these subsequently. These are sparse datasets as many studies do not report on all levels of smoking exposure.

The fifth sheet produces Table 3 this contains the data extracted from relevant studies on hospitalisation and smoking status.

The sixth sheet produces Table 4 this contains the data extracted from relevant studies disease severity data and smoking status. Disease severity is differentially reported across the studies in some there is reference to WHO severity score in others we have based the distinction on clinical characteristics or whether patients were admitted to Intensive or Critical care areas.

The seventh sheet produces Table 5 this contains the data extracted from relevant studies on COVID-19 related mortality and smoking status. A caution needed when interpreting this data is that most studies have not followed patients up for sufficient amounts of time to be able to accurately report on disease associated mortality.

The final sheet is used to collate data on Study quality. The method of assessing quality changed in version 3 of the review. We now use the following criteria:

  • Good:
    1. <20% missing data on smoking status and used a reliable self-report measure that distinguished between current, former and never smoking status AND
    2. biochemical verification of smoking status and reported results from adjusted analyses OR
    3. reported data from a representative/random sample
  • Fair
    1. <20% missing data on smoking status and used a reliable self-report measure that distinguished between current, former and never smoking status
  • Poor
    1. The remainder of the studies

1.4 Clean data and save updated versions as RDS files

The data from the google sheets document is then cleaned using a script from the linelist package which has been developed by T. Jombart and Z. Kamvar within the R Epidemics Consortium The function clean_data() ensures consistency among column names, country names, dates and missing values in the included sheets. Data is also saved as .rds files in the /data_clean folder of the project repository.

source(here("scripts", "clean_data.R"))

The package flextable can save tables in a format that is easily manageable within word which we currently use to write the manuscript (until I get better with RMarkdown). At various points I save these to the output_data folder.

2 Preparing the descriptive analysis

2.1 Choosing which data we want to include in the current review?

date_of_update <- Sys.Date()

prev_versions <- head(unique(data_study_general$review_version), n = -1)
#This can categorise which studies we want to look at
current_version <- tail(unique(data_study_general$review_version), n = 1)
#This will categorise which studies we are including in the current report

analysed_versions <- c(prev_versions, current_version)
#This will incorporate all studies into the current version of the report

versioning <- cbind(prev_versions, current_version, analysed_versions)
write_rds(versioning, here("data_clean", "versioning.rds"))

We define the current version of the report here. This report has been generated for version v12

superseded_studies <- data_study_general %>%
  filter(str_ends(lead_author, "_old")) %>%
  pull(lead_author)

old_isaric <- data_study_general %>%
  filter(str_starts(lead_author, "isaric")) %>%
  head(., -1) %>%
  pull(lead_author)
#This will prevent duplication of the ISARIC reports, only retaining the last based on their names

excluded_other_reason <- c("mehra", #retracted
                           "hopkinson", #self-reported
                           "gaibazzi", #withdrawn
                           "kua" #self-reported
                           )

exclude_from_analysis <- unique(c(superseded_studies, old_isaric, excluded_other_reason))
write_rds(exclude_from_analysis, here("output_data", "excluded_studies.rds"))

We don’t want to remove studies from the dataset despite them no longer being included in the analysis. The excluded studies are defined in this chunk of code. We have excluded this version the following studies:

petrilli_old, miyara_old, magagnoli_old, niedzwiedz_old, miyara_2_old, shi_zuo_old, bello_chavolla_old, sami_old, kimmig_old, russell_old, zuo_zuo_old, patel_old, pongpirul_old, favara_old, dashti_old, isaric_5_old, tao_old, martinez_lacalzada_old, sun_old, isaric_1, isaric_2, isaric_3, isaric_4, isaric_6, isaric, isaric_10, mehra, hopkinson, gaibazzi, and kua

Generally we label these as _old in the Google sheet but there are exceptions to this so we manually define them.

2.2 Filter the included studies

Based on the version and excluded studies we produce the tables for analysis.

This is a searchable table.

source(here("scripts", "remove_excluded.R")) # This removes the excluded studies from each table that has been read in previously.
current_total <- nrow(a)
datatable(a)
a <- flextable(a)%>%
  set_table_properties(width = 1, layout = 'autofit') %>%
  set_caption(caption = 'Studies included in the current analysis')

save_as_docx(a, path = here('output_data', 'included_studies.docx'))

3 Study descriptive analysis

3.1 The PRISMA diagram will be updated

The PRISMA reporting guidelines for systematic reviews the guidance is available here. This takes the format of the number of articles obtained from the search from OVID (login may be needed), medRxiv, and articles identified by the research team. These are then screened at the title and abstract level. Full texts are then read to extract relevant data, if articles are excluded at this level we describe the reason for this. All of the selected articles will contribute to the narrative synthesis but only those that are judged ‘good’ or ‘fair’ quality will be included in the Bayesian meta-analysis.

previous_total <- read_rds(here("data_clean", "data_study_general.rds")) %>%
  filter(review_version %in% prev_versions) %>%
  nrow()

prisma <- search_details %>%
  group_by(review_version) %>%
  filter(review_version == current_version) %>%
  mutate(title_abstract = sum(ovid_titles, medrxiv_titles, other_source_included),
         full_texts_assessed = sum(ovid_abstract_full, medrxiv_abstract_full, other_source_included),
         title_abstract_included = sum(ovid_included, medrxiv_included, other_source_included),
         title_abstract_excluded = title_abstract - full_texts_assessed,
         full_text_excluded = full_texts_assessed-total_included,
         full_text_included = full_texts_assessed-full_text_excluded,
         previous_total = previous_total,
         previous_minus_superseded = previous_total-superseded,
         current_total = current_total) %>%
  select("Review version" = review_version,
         "Date Screened" = date_screened,
         "OVID results" = "ovid_titles",
         "medRxiv results" = "medrxiv_titles",
         "Other sources" = "other_source_included",
         "Titles/abstracts screened" = "title_abstract",
         "Titles/abstracts excluded" = "title_abstract_excluded",
         "Full texts screened" = "full_texts_assessed",
         "Full texts excluded" = "full_text_excluded",
         "Records included" = "full_text_included",
         "Records carried forward" = "previous_minus_superseded",
         "Supersed from previous version" = superseded,
         "Studies in narrative synthesis" = "current_total")
flextable(prisma)

The number of studies contributing to meta-analysis is calculated later.

3.2 We summarise where the studies were conducted

As the pandemic continues studies are being reported from an increasing numbers of countries.

#Countries
country <- table_1$country %>%
  to_upper_camel_case(sep_out = " ") %>%
  recode(., 'Usa' = 'USA', 'Uk' = 'UK') %>%
  table() %>%
  as.data.frame() %>%
  arrange(desc(Freq)) %>%
  rename('Country' = 1,
         'Number' = 2)

all_countries <- country %>%
  group_split(-Number)

ggplot(country, aes(x = reorder(Country, desc(-Number)), y = Number))+
  geom_col()+
  coord_flip()+
  theme_bw()+
  labs(title = 'Countries where studies were performed',
       y = 'Number of studies',
       x = 'Country')

a <- flextable(country) %>%
  set_table_properties(width = 0.5, layout = 'autofit')
a
country <- country %>%
  filter(Country != "Multiple")

Currently studies are reported from 51 countries.

countries <- ne_countries(type = "countries", returnclass = "sf")

map <- country %>%
  mutate(iso3c = countrycode::countrycode(Country, origin = "country.name", destination = "iso3c")) %>%
  left_join(., countries %>%
              as_tibble() %>%
              dplyr::select(adm0_a3, geometry), by = c("iso3c" = "adm0_a3")) %>%
  st_as_sf()
  
country_bounds <- map %>%
  st_bbox()

map_world <- map_data(map = "world") %>%
  filter(region != "Antarctica")

save_as_docx(a, path = here('output_data', 'Countries_performing_studies.docx'))  

3.3 A map is produced for where the studies are performed

summary_df <- coronavirus %>% 
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) %>%
  arrange(-total_cases) %>%
  rename("name" = country) %>%
  mutate(name = recode(name,
         "US" = "United States",
         "Czechia" = "Czech Rep.",
         "Korea, South" = "Korea"))

country_plot <- left_join(country_map  %>%
            rename("name" = Country),
            countries %>%
              select(name),
          by = "name") %>%
  left_join(., summary_df %>%
              mutate(total_cases = total_cases/1000),
            by = "name") %>%
  mutate(name = as.factor(name)) %>%
  st_as_sf()

country_plot$Number[ which(country_plot$Number == 0)] = NA
country_plot$total_cases[ which(is.na(country_plot$total_cases))] = 0


my_bins <- c(0, 1, 5, 10, 20, 50, 70, max(country$Number))
mypalette <- colorBin(palette="YlOrBr", domain = country_plot$Number, na.color="transparent", bins=my_bins)

mytext <- paste(
  "Country: ", country_plot$name, "<br/>", 
  "Number of studies included: ", country_plot$Number, "<br/>", 
  "COVID-19 reported cases (thousands): ", round(country_plot$total_cases, 2), 
  sep="") %>%
  lapply(htmltools::HTML)

html_map <- leaflet(country_plot) %>% 
  addTiles()  %>% 
  setView(lat=10, lng=0 , zoom=2) %>%
  addPolygons( 
    fillColor = ~mypalette(Number), 
    stroke=TRUE, 
    fillOpacity = 0.9, 
    color="white", 
    weight=0.3,
    label = mytext,
    labelOptions = labelOptions( 
      style = list("font-weight" = "normal", padding = "3px 8px"), 
      textsize = "13px", 
      direction = "auto"
    )
  ) %>%
  addLegend( pal=mypalette, values=~Number, opacity=0.9, title = "Number of studies included", position = "bottomright" )

html_map

This map displays the number of studies produced in each country. The number of confirmed COVID-19 cases are shown when resting over each country for reference, data on confirmed cases have been obtained from Rami Krispin’s coronavirus package. The number of cases is updated when the report is generated so will not remain in date, it is current as of 2021-07-26 22:01:09. Countries displaying 0 cases are most likely due to naming mismatches.

3.4 We sumarise the settings in which they were conducted

Most studies are performed in hospital settings. Some studies are conducted in both such as those in emergency departments where patients are discharged home and followed up. Some registry studies utilising primary care networks are solely conducted in the community.

#Setting
setting <- to_upper_camel_case(data_study_general$study_setting, sep_out = ' ') %>%
  table(.) %>%
  sort('Number of studies', decreasing = T) %>%
  as.data.frame() %>%
  arrange(desc(Freq)) %>%
  rename('Setting' = 1,
         'Number' = 2)

a <- flextable(setting) %>%
  set_table_properties(width = 0.5, layout = 'autofit')
a
save_as_docx(a, path = here('output_data', 'Countries_performing_studies.docx'))  

3.5 We summarise the design of the included studies

The majority were retrospective cohorts. There were fewer case-control and cross-sectional studies. There were some prospective cohorts established and reported.

design <- to_upper_camel_case(data_study_general$study_design, sep_out = ' ') %>%
  recode("Rct" = "RCT") %>%
  table(.) %>%
  sort('Number of studies', decreasing = T) %>%
  as.data.frame() %>%
  arrange(desc(Freq)) %>%
  rename('Design' = 1,
         'Number' = 2)

a <- flextable(design) %>%
  set_table_properties(width = 0.5, layout = 'autofit')
a
save_as_docx(a, path = here('output_data', 'Study_designs.docx'))  

##For supplementary tables
study_design <- data_study_general %>%
  select(lead_author, study_design) %>%
  mutate(study_design = to_upper_camel_case(study_design, sep_out = " "),
         study_design = ifelse(study_design == "Rct", "Randomised Controlled Trial", study_design))

3.6 We update the number of participants we have included

  • Summary statistics: this table will give us the mean, median and IQR of the samples in the review
#Numbers
participants <- summary(as.numeric(table_1$sample_size))
  • Total number of participants: 72684436

3.7 We want to know if they use a CRF or electronic health record

The majority of data on smoking outcomes is extracted from electronic health records that are routinely collected in the provision of healthcare. Some of the studies utilise specifically designed Case Report Forms (CRFs). The source of information on smoking status is not always reported.

#Source of data
data_source <- table_1$data_source %>%
  recode('electronic_health_records' = 'Electronic health record',
         'case_report_form' = 'Case report form',
         'not_stated' = 'Not stated') %>%
  table() %>%
  as.data.frame() %>%
  arrange(desc(Freq)) %>%
  rename('Data source' = 1,
         'Number' = 2) 

data_source %>%
  flextable() %>%
  set_table_properties(width = 0.5, layout = 'autofit')

3.8 Classification of COVID

diagnostics <- table(data_study_general %>%
                       mutate(diagnostic_approach = case_when(is.na(diagnostic_approach) ~ "not_specified",
                                                              TRUE ~ diagnostic_approach)) %>%
                                pull(diagnostic_approach))
diagnostics
## 
##            ab           lft            no  not_reported not_specified           pcr        pcr_ab           yes 
##            30             1             5             1           191           295            23             1

4 Smoking descriptive data

#Studies reporting smokers
current_smok <- table_1 %>%
  filter(current_smoker != 'NA' | current_vaper != "NA" | current_smoker_vaper != 'NA' | current_smoker_former_vaper != 'NA')
#Studies reporting former smokers
former_smok <- table_1 %>%
  filter(., former_smoker != 'NA' | former_smoker_vaper != 'NA' | former_smoker_former_vaper != 'NA')
#Studies reporting never smokers
never_smok <- table_1 %>%
  filter(., never_smoker != 'NA' | never_smoker_vaper != 'NA' | never_smoker_former_vaper != 'NA')
#Studies reporting current/former smokers
current_former_smok <- table_1 %>%
  filter(., current_former_smoker != 'NA')
#Studies reporting missing data
missing_smok <- table_1 %>%
  filter(., missing != 'NA') %>%
  mutate(percent = missing/total*100)
minimum <- min(missing_smok$percent)
maximum <- max(missing_smok$percent)
#Studies reporting never/unknown
never_smok_unknown <- table_1 %>%
  filter(., never_smoker_unknown != 'NA')

#Studies with not stated
smok_not_stated <- table_1 %>%
  filter(., not_stated != 'NA')

4.1 Studies reporting current, former and never smoking status

#Studies reporting current, former and never smoking status
full_smoking_status <- table_1 %>%
  filter(lead_author %in% current_smok$lead_author & 
           lead_author %in% former_smok$lead_author & 
           lead_author %in% never_smok$lead_author)

b <- full_smoking_status %>%
  select(lead_author, sample_size, current_smoker, former_smoker, never_smoker, current_vaper,
         current_smoker_vaper, former_smoker_vaper, never_smoker_vaper,
         current_smoker_former_vaper, former_smoker_former_vaper, never_smoker_former_vaper,
         study_id) %>%
  rename('Lead author' = lead_author,
         'Sample size' = sample_size,
         'Current smokers' = current_smoker,
         'Current vaper' = current_vaper,
         'Former smokers' = former_smoker,
         'Never smokers' = never_smoker,
         "Current smoker, current vaper" = current_smoker_vaper, 
         "Former smoker, current vaper" = former_smoker_vaper,
         "Never smoker, current vaper" = never_smoker_vaper,
         "Current smoker, former vaper" = current_smoker_former_vaper,
         "Former smoker, former vaper" = former_smoker_former_vaper, 
         "Never smoker, former vaper" = never_smoker_former_vaper,
         'Study ID' = study_id)

b$`Lead author` <- to_upper_camel_case(b$`Lead author`, sep_out = ", ")
b$`Lead author` <- plyr::mapvalues(b$`Lead author`,
                               from = cleaned_names,
                               to = correct_names)  
datatable(b,
          caption = "Studies providing data on current, former and never smokers") %>%
  formatRound(columns = c(2:5), digits = 0, interval = 3, mark = ",")
#These are the studies with data on current, former and never smokers
set_flextable_defaults(digits = 0, big.mark = ",", na_str = "-")

b <- b %>%
  select(-"Study ID") %>%
  flextable() %>%
    set_caption(caption = 'Studies reporting complete smoking status') %>%
  colformat_double(j = c("Sample size", "Current smokers", "Former smokers", "Never smokers", "Current vaper",
                         "Current smoker, current vaper", "Former smoker, current vaper", "Never smoker, current vaper",
                         "Current smoker, former vaper", "Former smoker, former vaper", "Never smoker, former vaper"),
                digits = get_flextable_defaults()$digits, big.mark = get_flextable_defaults()$big.mark, na_str = get_flextable_defaults()$na_str) %>%
  set_table_properties(width = 1, layout = 'autofit')

save_as_docx(b, path = here('output_data', 'Supplementary_1a.docx'))
  • Studies with complete smoking status: 171, 31.26%

4.2 Studies providing partially complete data on smoking status

#Studies reporting current or current/former and never smoking
semi_full_smoking_status <- table_1 %>%
  filter(lead_author %in% current_former_smok$lead_author &
           lead_author %in% never_smok$lead_author &
           !lead_author %in% full_smoking_status$lead_author)

b <- semi_full_smoking_status %>%
  select(lead_author, sample_size, current_former_smoker, never_smoker, study_id) %>%
  rename('Lead author' = lead_author,
         'Sample size' = sample_size,
         'Current/former smokers' = current_former_smoker,
         'Never smokers' = never_smoker,
         'Study ID' = study_id)

b$`Lead author` <- to_upper_camel_case(b$`Lead author`, sep_out = ", ")
b$`Lead author` <- plyr::mapvalues(b$`Lead author`,
                               from = cleaned_names,
                               to = correct_names)  

datatable(b,
          caption = "Studies providing partially complete data on smoking status") %>%
  formatRound(columns = c(2:4), digits = 0, interval = 3, mark = ",")
b <- b %>%
  select(-"Study ID") %>%
  flextable() %>%
    set_caption(caption = 'Studies reporting complete smoking status') %>%
  colformat_double(j = c("Sample size", "Current/former smokers", "Never smokers"),
                digits = get_flextable_defaults()$digits, big.mark = get_flextable_defaults()$big.mark, na_str = get_flextable_defaults()$na_str) %>%
  set_table_properties(width = 1, layout = 'autofit')

save_as_docx(b, path = here('output_data', 'Supplementary_1b.docx'))
  • Studies reporting partially incomplete smoking data: 40
    These studies typically report smoking history as the measure of exposure which is why they have grouped values for current/former smokers

4.3 Studies providing incomplete data on smoking status

#Remaining studies
incomplete_smoking_status <- table_1 %>%
  filter(!lead_author %in% full_smoking_status$lead_author) %>%
  filter(!lead_author %in% semi_full_smoking_status$lead_author)

b <- incomplete_smoking_status %>%
  select(lead_author, sample_size, current_smoker, current_former_smoker, former_smoker, never_smoker, never_smoker_unknown, not_stated, missing, study_id) %>%
  rename('Lead author' = lead_author,
         'Sample size' = sample_size,
         'Current smokers' = current_smoker,
         'Current/former smokers' = current_former_smoker,
         'Former smoker' = former_smoker,
         'Never smokers' = never_smoker,
         'Never smoker/unknown' = never_smoker_unknown,
         'Not stated' = not_stated,
         'Missing' = missing,
         'Study ID' = study_id)

b$`Lead author` <- to_upper_camel_case(b$`Lead author`, sep_out = ", ")
b$`Lead author` <- plyr::mapvalues(b$`Lead author`,
                               from = cleaned_names,
                               to = correct_names)  

datatable(b,
          caption = "Studies providing incomplete data on smoking status") %>%
  formatRound(columns = c(2:8), digits = 0, interval = 3, mark = ",")
numeric_names <- colnames(b[2:9])
b <- b %>%
  select(-"Study ID") %>%
  flextable() %>%
    set_caption(caption = 'Studies reporting complete smoking status') %>%
  colformat_double(j = numeric_names,
                digits = 0, na_str = '-', big.mark = ',') %>%
  set_table_properties(width = 1, layout = 'autofit')

save_as_docx(b, path = here('output_data', 'Supplementary_1c.docx'))
  • Studies missing key smoking data: 336

5 Smoking prevalence by country

#Smoking prevalence by country
country_prevalence_list <- table_1 %>%
    group_by(country, .add = T) %>%
  mutate(current_smok_percentage = current_smoker/total*100,
         former_smok_percentage = former_smoker/total*100,
         missing_percentage = missing/total*100,
         not_stated_percentage = not_stated/total*100,
         never_smoker_percentage = never_smoker/total*100) %>%
  group_split()

source(here("scripts", "smoking_plots.R"))

We have split the data into a list where each country can be analysed separately if required.

Prevalence graphs are produced in the smoking_plots.R script. This uses data extracted on current and former smoking prevalence data from countries which have included studies in this review. We calculate the mean weighted prevalence.

5.1 Graph for smoking prevalence

Current smoking prevalence The studies are grouped by the country in which they were conducted. The dashed line represents national current smoking prevalence obtained from national smoking survey estimates. The size of the shape corresponds to the relative size of the study sample. The weighted mean current smoking prevalence is displayed for each setting.

Country smoking prevalence

5.2 Graph for former smoking prevalence

Former smoker prevalence This graph is for former smoking prevalence. The data is more sparse for this exposure and therefore several countries are missing for the analysis. Studies are further grouped by the setting in which they were conducted, Hospital (Triangle), Community (Square) and Community and Hospital (Circle). The size of the shape corresponds to the relative size of the study sample. The weighted mean current smoking prevalence is displayed for each setting. Country former smoking prevalence

6 Final results tables and meta-analyses

6.1 Table 1: Included studies

#table 1 for references
table_1_refs <- table_1 %>%
  left_join(., data_study_general %>%
              select(lead_author, doi),
            by = "lead_author")
#Updating table 1
table_1_word <- table_1 %>%
  mutate(current_percentage = current_smoker/total*100,
         former_percentage = former_smoker/total*100,
         current_former_percentage = current_former_smoker/total*100,
         never_smoker_percentage = never_smoker/total*100,
         never_smoker_unknown_percentage = never_smoker_unknown/total*100,
         current_vaper_p = current_vaper/total*100,
         current_smoker_current_vaper_p = current_smoker_vaper/total*100,
         former_smoker_current_vaper_p = former_smoker_vaper/total*100,
         never_smoker_current_vaper_p = never_smoker_vaper/total*100,
         current_smoker_former_vaper_p = current_smoker_former_vaper/total*100,
         former_smoker_former_vaper_p = former_smoker_former_vaper/total*100,
         never_smoker_former_vaper_p = never_smoker_former_vaper/total*100,
         not_stated_percentage = not_stated/total*100,
         missing_percentage = missing/total*100) %>%
  select(lead_author, date_published, country, sample_size, median_age, iqr_lower, iqr_upper, mean_age, lower_range,
         upper_range, standard_deviation, female_sex_percent, 
         current_percentage, current_vaper_p, current_smoker_current_vaper_p, current_smoker_former_vaper_p,
         former_percentage, former_smoker_current_vaper_p, former_smoker_former_vaper_p,
         current_former_percentage,
         never_smoker_percentage, never_smoker_current_vaper_p, never_smoker_former_vaper_p,
         never_smoker_unknown_percentage, not_stated_percentage, missing_percentage, study_id) %>%
  replace_na(., list(not_stated_percentage = 0, missing_percentage = 0)) %>%
  mutate(., missing_percentage = not_stated_percentage + missing_percentage,
         missing_percentage = formatC(missing_percentage, digits = 2, format = "f"))

#Smoking completeness
smoking_status <- as_tibble(full_smoking_status$lead_author) %>%
  rename(lead_author = 1) %>%
  mutate(complete_status = c("Yes")) %>%
  bind_rows(., semi_full_smoking_status %>%
              select(lead_author)) %>%
  bind_rows(., incomplete_smoking_status %>%
              select(lead_author)) %>%
  distinct() %>%
  mutate(complete_status = replace_na(complete_status, "No"))

#Data missingness
missingness <- table_6 %>%
  select(lead_author) %>%
  left_join(., table_1_word %>%
              select(lead_author, missing_percentage, never_smoker_unknown_percentage)) %>%
  group_by(lead_author) %>%
  mutate(missing_percentage = as.numeric(missing_percentage),
         missing = sum(missing_percentage, never_smoker_unknown_percentage, na.rm = T),
         low_missingness = ifelse(missing >= 20, "No", "Yes")) %>%
  distinct(., .keep_all = T)

#Defining poor quality
poor_quality <- missingness %>%
  select(-missing_percentage) %>%
  left_join(., smoking_status, by = "lead_author") %>%
  mutate(study_quality = ifelse(low_missingness == "No" | complete_status == "No", "Poor", "Not_poor")) %>%
  distinct(., .keep_all = T)

good_quality <- table_6 %>%
  left_join(., poor_quality, by = "lead_author") %>%
  select(lead_author, study_quality, biochemical_verification, random_representative_sample) %>%
  mutate(study_quality_final = ifelse(study_quality == "Not_poor" & (biochemical_verification == "Yes" | random_representative_sample == "Yes"), "good",
                                      ifelse(study_quality == "Poor", "poor", "fair")))  %>%
  distinct(., .keep_all = T)

quality_rating <- good_quality %>%
  select(lead_author, study_quality_final) %>%
  rename("overall_quality" = study_quality_final)

fair_quality_number <- quality_rating %>%
  filter(overall_quality == "fair")

poor_quality_number <- quality_rating %>%
  filter(overall_quality == "poor")

table_1_word <- left_join(table_1_word, good_quality %>%
                            select(lead_author, study_quality_final))

a <- data_study_general %>%
  select(study_id, study_setting)

table_1_word <- left_join(table_1_word, a, by = 'study_id')
table_1_word$date_published <- as.Date.character(table_1_word$date_published)
write_rds(table_1_word, here::here('data_clean', 'table_1_word.rds'))

table_1_word <- table_1_word %>%
  mutate(median_mean = ifelse(is.na(median_age), mean_age, median_age))

a <- table_1_word %>%
  mutate(median_mean = ifelse(is.na(median_age), mean_age, median_age),
         mean_used = ifelse(is.na(mean_age), '','^'),
         iqr = ifelse(is.na(iqr_lower), NA, paste(iqr_lower, iqr_upper, sep = '-')),
         range_combined = paste(lower_range, upper_range, sep = '-'),
         range_combined = na_if(range_combined, 'NA-NA'),
         standard_deviation = as.numeric(standard_deviation),
         st_dev = paste((as.integer(median_mean-standard_deviation)), as.integer((median_mean+standard_deviation)), sep = '-'),
         st_dev = na_if(st_dev, 'NA-NA'))

a$iqr <- coalesce(a$iqr, a$range_combined, a$st_dev)

a <- a %>%
  select(study_id, lead_author, date_published, country, sample_size, study_setting, median_mean, mean_used, iqr,
         female_sex_percent,
         current_percentage, current_vaper_p, current_smoker_current_vaper_p, current_smoker_former_vaper_p,
         former_percentage, former_smoker_current_vaper_p, former_smoker_former_vaper_p,
         current_former_percentage,
         never_smoker_percentage, never_smoker_current_vaper_p, never_smoker_former_vaper_p,
         never_smoker_unknown_percentage, 
         missing_percentage, study_quality_final) %>%
  mutate(median_mean = paste(median_mean, mean_used, sep = ''),
         median_mean = na_if(median_mean, NA)) %>%
  select(-mean_used) %>%
  mutate(median_mean = ifelse(median_mean == 'NA', 'NA', paste(paste(median_mean, iqr, sep = ' ('),')', sep = ''))) %>%
  select(-iqr) %>%
  left_join(., table_1_refs %>%
              select(lead_author, doi),
            by = "lead_author") %>%
  rename('Study ID' = study_id,
         'Lead author' = lead_author,
         'Date published' = date_published,
         'Country' = country,
         'Sample size' = sample_size,
         'Study setting' = study_setting,
         'Median (IQR)' = median_mean,
         'Female %' = female_sex_percent,
         'Current smoker %' = current_percentage,
         "Current vaper %" = current_vaper_p,
         "Current smoker, current vaper %" = current_smoker_current_vaper_p,
         "Current smoker, former vaper %" = current_smoker_former_vaper_p,
         'Former smoker %' = former_percentage,
         "Former smoker, current vaper %" = former_smoker_current_vaper_p,
         "Former smoker, former vaper %" = former_smoker_former_vaper_p,
         'Current/former smokers %' = current_former_percentage,
         'Never smokers %' = never_smoker_percentage,
         "Never smoker, current vaper %" = never_smoker_current_vaper_p,
         "Never smoker, former vaper %" = never_smoker_former_vaper_p,
         'Never/unknown smokers %' = never_smoker_unknown_percentage,
         'Missing %' = missing_percentage,
         "Study quality" = study_quality_final)

a$`Lead author` <- to_upper_camel_case(a$`Lead author`, sep_out = ", ")
author_list <- plyr::mapvalues(a$`Lead author`,
                               from = cleaned_names,
                               to = correct_names)

a$`Lead author` <- author_list

a$`Study setting` <-to_title_case(a$`Study setting`, sep_out = " ")
a$`Country` <-to_title_case(a$`Country`, sep_out = " ")
a$`Country` <- a$Country %>%
  recode('Usa' = 'USA',
         'Uk' = 'UK')

numeric_columns <- c('Median (IQR)', 'Female %', 'Current smoker %', "Current vaper %", "Current smoker, current vaper %" , "Current smoker, former vaper %",
                     'Former smoker %', "Former smoker, current vaper %", "Former smoker, former vaper %",
                     'Current/former smokers %',
                     'Never smokers %', "Never smoker, current vaper %", "Never smoker, former vaper %",
                     'Never/unknown smokers %', 'Missing %')

saveRDS(a %>%
          select(doi, everything()) %>%
          select(-"Study ID") %>%
          rename("ref" = doi),
        file = here("data_clean", "table_1_refs.rds"))

# The references in table script pulls in the doi information from the manuscript.bib file in reports
source(here("scripts", "references_in_table.R"))

The total number of included studies is 547. This table is searchable. The references are added in automatically from Zotero. The bibtex file is saved as manuscript.bib and is updated following each search.

datatable(a) %>%
  formatRound(
    columns = c(8:21),
    digits = 1,
    interval = 3,
    mark = ",",
    dec.mark = getOption("OutDec")
)
a <- flextable(a) %>%
    set_caption(caption = 'Characteristics of included studies') %>%
  colformat_double(j = numeric_columns, digits = 1, na_str = '-', big.mark = ',') %>%
  colformat_double(j = 'Sample size', digits = 0, na_str = '-', big.mark = ',') %>%
  set_table_properties(width = 1, layout = 'autofit')

saveRDS(a, file = here("reports", "tables", "table_1.rds"))
save_as_docx(a, path = here('output_data', 'Table_1.docx'))

6.2 Table 2: Smoking and COVID-19 diagnostic testing

#Table 2
table_2_word <-  table_2 %>%
  mutate(., sample = contributing_sample,
         negative_test_percentage =formatC(negative_test/sample*100, digits = 2, format = "f"),
         negative_current_percentage = formatC(negative_current_smoker/negative_test*100, digits = 2, format = "f"),
         negative_former_smoker_percentage = formatC(negative_former_smoker/negative_test*100, digits = 2, format = "f"),
         negative_current_former_smoker_percentage = formatC(negative_current_former_smoker/negative_test*100, digits = 2, format = "f"),
         negative_never_smoker_percentage = formatC(negative_never_smoker/negative_test*100, digits = 2, format = "f"),
         negative_current_vaper_p = formatC(negative_current_vaper/negative_test*100, digits = 2, format = "f"),
         negative_current_smoker_vaper_p = formatC(negative_current_smoker_vaper/negative_test*100, digits = 2, format = "f"),
         negative_not_stated_percentage = formatC(negative_not_stated/negative_test*100, digits = 2, format = "f"),
         positive_test_percentage = formatC(positive_test/sample*100, digits = 2, format = "f"),
         positive_current_smoker_percentage = formatC(positive_current_smoker/positive_test*100, digits = 2, format = "f"),
         positive_former_smoker_percentage = formatC(positive_former_smoker/positive_test*100, digits = 2, format = "f"),
         positive_current_former_smoker_percentage = formatC(positive_current_former_smoker/positive_test*100, digits = 2, format = "f"),
         positive_never_smoker_percentage = formatC(positive_never_smoker/positive_test*100, digits = 2, format = "f"),
         positive_never_former_percentage = formatC(positive_never_former/positive_test*100, digits = 2, format = "f"),
         positive_current_vaper_p = formatC(postitive_current_vaper/positive_test*100, digits = 2, format = "f"),
         positive_current_smoker_vaper_p = formatC(positive_current_smoker_vaper/positive_test*100, digits = 2, format = "f"),
         positive_not_stated_percentage = formatC(positive_not_stated/positive_test*100, digits = 2, format = "f")) %>%
  select(-data_on_testing, -missing, -date_published, -sample, -c(negative_current_former_vaper, negative_former_vaper, positive_current_former_vaper, positive_former_vaper)) %>%
  write_rds(here::here('data_clean', 'table_2_word.rds'))

quality_table_2 <- table_2_word %>%
  left_join(., quality_rating, by = 'lead_author')

a <- table_2_word %>%
  filter(contributing_sample >= 1) %>%
  mutate(Author = lead_author,
         Population_tested = contributing_sample,
         SARS_CoV_2_negative = paste(
           paste(negative_test, 
                 (negative_test_percentage), sep = " ("), "%)", sep = ""),
         N_current_smoker = paste(paste(negative_current_smoker, (negative_current_percentage), sep = " (")
                                  , "%)", sep = ""),
         N_current_smoker = na_if(N_current_smoker, "NA ( NA%)"),
         N_former_smoker = paste(paste(negative_former_smoker, (negative_former_smoker_percentage), sep = " (")
                                 , "%)", sep = ""),
         N_former_smoker = na_if(N_former_smoker, "NA ( NA%)"),
         N_current_former_smoker = paste(paste(negative_current_former_smoker,
                                               (negative_current_former_smoker_percentage), sep = " (")
                                         , "%)", sep = ""),
         N_current_former_smoker = na_if(N_current_former_smoker, "NA ( NA%)"),
         N_never_smoker = paste(paste(negative_never_smoker,
                                      (negative_never_smoker_percentage), sep = " (")
                                , "%)", sep = ""),
         N_never_smoker = na_if(N_never_smoker, "NA ( NA%)"),
         N_current_vaper = paste(paste(negative_current_vaper,
                                       (negative_current_vaper_p), sep = " (")
                                 , "%)", sep = ""),
         N_current_vaper = na_if(N_current_vaper, "NA ( NA%)"),
         N_current_smoker_vaper = paste(paste(negative_current_smoker_vaper,
                                              (negative_current_smoker_vaper_p), sep = " (")
                                        , "%)", sep = ""),
         N_current_smoker_vaper = na_if(N_current_smoker_vaper, "NA ( NA%)"),
         N_not_stated = paste(paste(negative_not_stated,
                                    (negative_not_stated_percentage), sep = " (")
                              , "%)", sep = ""),
         N_not_stated = na_if(N_not_stated, "NA ( NA%)")) %>%
  mutate(SARS_CoV_2_positive = paste(
    paste(positive_test, 
          (positive_test_percentage), sep = " ("), "%)", sep = ""),
    P_current_smoker = paste(paste(positive_current_smoker, (positive_current_smoker_percentage), sep = " (")
                             , "%)", sep = ""),
    P_current_smoker = na_if(P_current_smoker, "NA ( NA%)"),
    P_former_smoker = paste(paste(positive_former_smoker, (positive_former_smoker_percentage), sep = " (")
                            , "%)", sep = ""),
    P_former_smoker = na_if(P_former_smoker, "NA ( NA%)"),
    P_current_former_smoker = paste(paste(positive_current_former_smoker,
                                          (positive_current_former_smoker_percentage), sep = " (")
                                    , "%)", sep = ""),
    P_current_former_smoker = na_if(P_current_former_smoker, "NA ( NA%)"),
    P_never_smoker = paste(paste(positive_never_smoker,
                                 (positive_never_smoker_percentage), sep = " (")
                           , "%)", sep = ""),
    P_never_smoker = na_if(P_never_smoker, "NA ( NA%)"),
    P_current_vaper = paste(paste(postitive_current_vaper,
                                  (positive_current_vaper_p), sep = " (")
                            , "%)", sep = ""),
    P_current_vaper = na_if(P_current_vaper, "NA ( NA%)"),
    P_current_smoker_vaper = paste(paste(positive_current_smoker_vaper,
                                         (positive_current_smoker_vaper_p), sep = " (")
                                   , "%)", sep = ""),
    P_current_smoker_vaper = na_if(P_current_smoker_vaper, "NA ( NA%)"),
    P_not_stated = paste(paste(positive_not_stated,
                               (positive_not_stated_percentage), sep = " (")
                         , "%)", sep = ""),
    P_not_stated = na_if(P_not_stated, "NA ( NA%)")) %>%
  select(Author, Population_tested, SARS_CoV_2_negative, N_current_smoker, N_former_smoker, N_current_former_smoker,
         N_never_smoker, N_current_vaper, N_current_smoker_vaper, N_not_stated,
         SARS_CoV_2_positive, P_current_smoker, P_former_smoker, P_current_former_smoker,
         P_never_smoker, P_current_vaper, P_current_smoker_vaper, P_not_stated)

         
a$Author <- to_upper_camel_case(a$Author, sep_out = ", ")
a$Author  <- a$Author  %>%
  recode("Bello, Chavolla" = "Bello-Chavolla",
         "De, Lusignan" = "de Lusignan",
         "Del, Valle" = "del Valle")
a$Author <- to_upper_camel_case(a$Author, sep_out = ", ")
author_list <- plyr::mapvalues(a$Author,
                               from = cleaned_names,
                               to = correct_names)

a$Author <- author_list

numeric_columns <- c('Population_tested',
                     "SARS_CoV_2_negative", "N_current_smoker", "N_former_smoker", "N_current_former_smoker",
                     "N_never_smoker", "N_current_vaper", "N_current_smoker_vaper", "N_not_stated",
                     "SARS_CoV_2_positive", "P_current_smoker", "P_former_smoker", "P_current_former_smoker",
                     "P_never_smoker", "P_current_vaper", "P_current_smoker_vaper", "P_not_stated")
a <- flextable(a) %>%
    colformat_double(j = numeric_columns, digits = 0, na_str = '-', big.mark = ',')

The number of studies reporting on infection by smoking status is: 108

a <- set_header_labels(a,
                       Population_tested = 'Total population tested',
                       SARS_CoV_2_negative = "N (%)",
                       N_current_smoker = "Current smoker (%)",
                       N_former_smoker = "Former smoker (%)",
                       N_current_former_smoker = "Current/former smoker (%)",
                       N_never_smoker = "Never smoker (%)",
                       N_current_vaper = "Current vaper (%)",
                       N_current_smoker_vaper = "Current vaper and smoker (%)",
                       N_not_stated = "Not stated (%)",
                       SARS_CoV_2_positive = "N (%)",
                       P_current_smoker = "Current smoker (%)",
                       P_former_smoker = "Former smoker (%)",
                       P_current_former_smoker = "Current/former smoker (%)",
                       P_never_smoker = "Never smoker (%)",
                       P_current_vaper = "Current vaper (%)",
                       P_current_smoker_vaper = "Current vaper and smoker (%)",
                       P_not_stated = "Not stated (%)") %>%
  add_header_row(top = TRUE, values = c("","SARS-CoV-2 negative", "SARS-CoV-2 positive" ), colwidths = c(2, 8, 8)) %>%
  theme_booktabs() %>%
  fix_border_issues() %>%
  set_table_properties(width = 1, layout = 'autofit')  %>%
  colformat_char(na_str = '-')
a
saveRDS(a, file = here("reports", "tables", "table_2.rds"))
save_as_docx(a, path = here('output_data', 'Table_2.docx'))

6.2.1 Smoking and testing meta-analysis

The steps to reproduce the Bayesian analysis are available above. We generate the treatment effect (mu) and standard deviations (tau) from the studies using the meta package

source(here::here('scripts', 'rr_function.R'))

table_2 <- table_2_word

included_studies <- quality_table_2 %>%
  filter(overall_quality != 'poor') %>%
  select(lead_author) %>%
  filter(!lead_author %in% c("reiter", "hippisley_cox", "ullah", "lowe", "polat"))

good_quality_2 <- quality_table_2 %>%
  filter(overall_quality == 'good')

fair_quality_2 <- quality_table_2 %>%
  filter(lead_author %in% included_studies$lead_author) %>%
  filter(overall_quality == 'fair')

meta <- tibble('author' = table_2$lead_author,
               'negative_smoker' = table_2$negative_current_smoker,
               'negative_never_smoker' = table_2$negative_never_smoker,
               'positive_smoker' = table_2$positive_current_smoker,
               'positive_never_smoker' = table_2$positive_never_smoker,
               'negative_former_smoker' = table_2$negative_former_smoker,
               'positive_former_smoker' = table_2$positive_former_smoker,
               'negative_current_vaper' = table_2$negative_current_vaper,
               'positive_current_vaper' = table_2$postitive_current_vaper,
               'negative_current_smoker_vaper' = table_2$negative_current_smoker_vaper,
               'positive_current_smoker_vaper' = table_2$positive_current_smoker_vaper) %>%
        filter(author %in% included_studies$lead_author)
meta$author <- to_upper_camel_case(meta$author, sep_out = ", ")
meta$author <- plyr::mapvalues(meta$author,
                               from = cleaned_names,
                               to = correct_names)
study_dup <- filter(meta, author != "Niedzwiedz")
study_dup <- study_dup$author

testing_rr_current <- list()
for(study in 1:length(study_dup)){
  testing_rr_current[[study]] <- RR_testing(study_dup[study], "current")
}
testing_rr_former <- list()
for(study in 1:length(study_dup)){
  testing_rr_former[[study]] <- RR_testing(study_dup[study], "former")
}

data <- c(testing_rr_current, testing_rr_former)
k <- do.call(rbind.data.frame, data)

#SEs for Niedzwiedz et al. 2020

#current vs. never

niedz_log_RR_1<-log(1.15)
niedz_log_SE_1<-(log(1.54)-log(0.86))/3.92

k <- k %>%
  add_row(., study = 'Niedzwiedz', smoking_status = 'current', log_RR = niedz_log_RR_1, log_SE = niedz_log_SE_1)
#former vs. never

niedz_log_RR_2<-log(1.42)
niedz_log_SE_2<-(log(1.69)-log(1.19))/3.92

k <- k %>%
  add_row(., study = 'Niedzwiedz', smoking_status = 'former', log_RR = niedz_log_RR_2, log_SE = niedz_log_SE_2)

numbers_in_analysis <- table_2_word %>%
        left_join(., quality_rating, by = 'lead_author') %>%
        filter(., overall_quality != 'poor') %>%
        add_row(lead_author = 'niedzwiedz', contributing_sample = 1474) %>%
        replace_na(list(negative_not_stated = 0, positive_not_stated = 0)) %>%
        mutate(contributing_sample = (contributing_sample - (negative_not_stated+positive_not_stated))) %>%
                       select(lead_author, contributing_sample) %>%
  na.omit()
numbers_in_analysis$lead_author <- to_upper_camel_case(numbers_in_analysis$lead_author, sep_out = ", ")
numbers_in_analysis$lead_author <- plyr::mapvalues(numbers_in_analysis$lead_author,
                               from = cleaned_names,
                               to = correct_names)

running_meta_count <- k %>%
  group_by(study) %>%
  summarise()

if(file.exists(here("data_clean", "bayesian_models", "ecdf_list.rds"))) {
ecdf <- read_rds(here("data_clean", "bayesian_models", "ecdf_list.rds"))
m1_a_ecdf <- ecdf[[1]]
m2_a_ecdf <- ecdf[[2]]
m3_a_ecdf <- ecdf[[3]]
m4_a_ecdf <- ecdf[[4]]
m5_a_ecdf <- ecdf[[5]]
m6_a_ecdf <- ecdf[[6]]
m7_a_ecdf <- ecdf[[7]]
m8_a_ecdf <- ecdf[[8]]
}

The studies included in meta-analysis are rentsch, cho, kolin, de_lusignan, parrotta, israel, eugen_olsen, houlihan, woolford, govind, gu, auvinen, merkely, iversen, fillmore, petrilli, niedzwiedz, carrat, meini, roederer, ioannou, perico, ghinai, gallichotte, barasa, paleiron, didikoglu, kantele, richard, mora, radon, shields, nielsen, gornyk, campbell, mostafa, lombardi, kahlert, and barchuk.

6.2.1.1 Current versus never smokers for testing positive

#current vs. never smokers
current_never_meta <- k %>%
  filter(smoking_status == 'current') %>%
  left_join(., numbers_in_analysis %>%
              rename("study" = lead_author),
            by = c("study")) %>%
  mutate(study = paste(study, contributing_sample, sep = ", n = ")) %>%
  select(-contributing_sample)

a <-metagen(current_never_meta$log_RR,
           current_never_meta$log_SE,
           studlab = current_never_meta$study,
           sm="RR",
           comb.fixed = F, comb.random = T)


write_rds(a, here("data_clean", "bayes_testing_current.rds"))

Current smoking against never smoking for testing positive ECDF for current smoking against never smoking for testing positive

6.2.1.2 Former versus never smokers

#former vs. never smokers

former_never_meta <- k %>%
  filter(smoking_status == 'former') %>%
  left_join(., numbers_in_analysis %>%
              rename("study" = lead_author),
            by = c("study")) %>%
  mutate(study = paste(study, contributing_sample, sep = ", n = ")) %>%
  select(-contributing_sample)

a<-metagen(former_never_meta$log_RR,
           former_never_meta$log_SE,
           studlab = former_never_meta$study,
           sm="RR", comb.fixed = F, comb.random = T)

write_rds(a, here("data_clean", "bayes_testing_former.rds"))

Former smoking against never smoking for testing positive ECDF for former smoking against never smoking for testing positive

6.3 Table 3: Smoking and COVID-19 hospitalisation

#Table 3
table_3_word <- table_3 %>%
  mutate(sample = sample_with_outcome,
         community_percentage = formatC(number_community/sample*100, digits = 2, format = "f"),
         community_current_smoker_percent = formatC(community_current_smoker/number_community*100, digits = 2, format = "f"),
         community_former_smoker_percent = formatC(community_former_smoker/number_community*100, digits = 2, format = "f"),
         community_current_former_smoker_percent = formatC(community_current_former_smoker/number_community*100, digits = 2, format = "f"),
         community_never_smoker_percent = formatC(community_never_smoker/number_community*100, digits = 2, format = "f"),
         community_never_unknown_smoker_percent = formatC(community_never_unknown_smoker/number_community*100, digits = 2, format = "f"),
         community_current_vaper_p = formatC(community_current_vaper/number_community*100, digits = 2, format = "f"),
         community_former_vaper_p = formatC(community_former_vaper/number_community*100, digits = 2, format = "f"),
         community_current_former_vaper_p = formatC(community_current_former_vaper/number_community*100, digits = 2, format = "f"),
         community_not_stated_percent = formatC(community_not_stated/number_community*100, digits = 2, format = "f"),
         number_hospitalised_percent = formatC(number_hospitalised/sample*100, digits = 2, format = "f"),
         hospitalised_current_smoker_percent = formatC(hospitalised_current_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_former_smoker_percent = formatC(hospitalised_former_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_current_former_smoker_percent = formatC(hospitalised_current_former_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_never_smoker_percent = formatC(hospitalised_never_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_never_unknown_smoker_percent = formatC(hospitalised_never_unknown_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_current_vaper_p = formatC(hospitalised_never_unknown_smoker/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_former_vaper_p = formatC(hospitalised_former_vaper/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_current_former_vaper_p = formatC(hospitalised_current_former_vaper/number_hospitalised*100, digits = 2, format = "f"),
         hospitalised_not_stated_percent = formatC(hospitalised_not_stated/number_hospitalised*100, digits = 2, format = "f")) %>%
  select(lead_author, sample_with_outcome, number_community, community_percentage,
         community_current_smoker, community_current_smoker_percent, community_former_smoker, 
         community_former_smoker_percent, community_current_former_smoker,
         community_current_former_smoker_percent, community_never_smoker, community_never_smoker_percent,
         community_current_vaper, community_current_vaper_p, community_former_vaper, community_former_vaper_p,
         community_current_former_vaper, community_current_former_vaper_p,
         community_never_unknown_smoker, community_never_unknown_smoker_percent, 
         community_not_stated, community_not_stated_percent, number_hospitalised, number_hospitalised_percent,
         hospitalised_current_smoker, hospitalised_current_smoker_percent, hospitalised_former_smoker,
         hospitalised_former_smoker_percent, hospitalised_current_former_smoker, hospitalised_current_former_smoker_percent,
         hospitalised_never_smoker, hospitalised_never_smoker_percent, hospitalised_current_vaper, hospitalised_current_vaper_p, hospitalised_former_vaper, hospitalised_former_vaper_p,
         hospitalised_current_former_vaper, hospitalised_current_former_vaper_p,
         hospitalised_never_unknown_smoker, hospitalised_never_unknown_smoker_percent, hospitalised_not_stated, hospitalised_not_stated_percent) %>%
  write_rds(here::here('data_clean', 'table_3_word.rds'))

The number of studies reporting on hospitalisation by smoking status is: 55

quality_table_3 <- table_3_word %>%
  left_join(., quality_rating, by = 'lead_author') %>%
  select(lead_author, overall_quality)

a <- table_3_word %>%
  filter(number_community >= 1) %>%
  mutate(Author = lead_author,
         Population = sample_with_outcome,
         Community = paste(
           paste(number_community, 
                 as.integer(community_percentage), sep = " ("), "%)", sep = ""),
         C_current_smoker = paste(paste(community_current_smoker, (community_current_smoker_percent), sep = " (")
                                  , "%)", sep = ""),
         C_current_smoker = na_if(C_current_smoker, "NA ( NA%)"),
         C_former_smoker = paste(paste(community_former_smoker, (community_former_smoker_percent), sep = " (")
                                 , "%)", sep = ""),
         C_former_smoker = na_if(C_former_smoker, "NA ( NA%)"),
         C_current_former_smoker = paste(paste(community_current_former_smoker,
                                               (community_current_former_smoker_percent), sep = " (")
                                         , "%)", sep = ""),
         C_current_former_smoker = na_if(C_current_former_smoker, "NA ( NA%)"),
         C_never_smoker = paste(paste(community_never_smoker,
                                      (community_never_smoker_percent), sep = " (")
                                , "%)", sep = ""),
         C_never_smoker = na_if(C_never_smoker, "NA ( NA%)"),
         C_never_unknown_smoker = paste(paste(community_never_unknown_smoker,
                                              (community_never_unknown_smoker_percent), sep = " (")
                                        , "%)", sep = ""),
         C_never_unknown_smoker = na_if(C_never_unknown_smoker, "NA ( NA%)"),
         C_not_stated = paste(paste(community_not_stated,
                                    (community_not_stated_percent), sep = " (")
                              , "%)", sep = ""),
         C_not_stated = na_if(C_not_stated, "NA ( NA%)")) %>%
  mutate(Hospitalised = paste(
    paste(number_hospitalised, 
          as.integer(number_hospitalised_percent), sep = " ("), "%)", sep = ""),
    H_current_smoker = paste(paste(hospitalised_current_smoker, (hospitalised_current_smoker_percent), sep = " (")
                             , "%)", sep = ""),
    H_current_smoker = na_if(H_current_smoker, "NA ( NA%)"),
    H_former_smoker = paste(paste(hospitalised_former_smoker, (hospitalised_former_smoker_percent), sep = " (")
                            , "%)", sep = ""),
    H_former_smoker = na_if(H_former_smoker, "NA ( NA%)"),
    H_current_former_smoker = paste(paste(hospitalised_current_former_smoker,
                                          (hospitalised_current_former_smoker_percent), sep = " (")
                                    , "%)", sep = ""),
    H_current_former_smoker = na_if(H_current_former_smoker, "NA ( NA%)"),
    H_never_smoker = paste(paste(hospitalised_never_smoker,
                                 (hospitalised_never_smoker_percent), sep = " (")
                           , "%)", sep = ""),
    H_never_smoker = na_if(H_never_smoker, "NA ( NA%)"),
    H_never_unknown_smoker = paste(paste(hospitalised_never_unknown_smoker,
                                         (hospitalised_never_unknown_smoker_percent), sep = " (")
                                   , "%)", sep = ""),
    H_never_unknown_smoker = na_if(H_never_unknown_smoker, "NA ( NA%)"),
    H_not_stated = paste(paste(hospitalised_not_stated,
                               (hospitalised_not_stated_percent), sep = " (")
                         , "%)", sep = ""),
    H_not_stated = na_if(H_not_stated, "NA ( NA%)")) %>%
  select(Author, Population, Community, C_current_smoker, C_former_smoker, C_current_former_smoker,
         C_never_smoker,C_never_unknown_smoker,C_not_stated, Hospitalised, H_current_smoker, H_former_smoker,
         H_current_former_smoker, H_never_smoker, H_never_unknown_smoker, H_not_stated)

         
a$Author <- to_upper_camel_case(a$Author, sep_out = ", ")
a$Author <- plyr::mapvalues(a$Author,
                               from = cleaned_names,
                               to = correct_names)

numeric_columns <- 'Population'

a <- flextable(a) %>%
  colformat_double(j = numeric_columns, digits = 0, na_str = '-', big.mark = ',') %>%
  colformat_char(na_str = '-')

a <- set_header_labels(a,
                       Population = 'Population with outcome',
                       Community = "N (%)",
                       C_current_smoker = "Current smoker (%)",
                       C_former_smoker = "Former smoker (%)",
                       C_current_former_smoker = "Current/former smoker (%)",
                       C_never_smoker = "Never smoker (%)",
                       C_never_unknown_smoker = "Never/unknown smoker (%)",
                       C_not_stated = "Not stated (%)",
                       Hospitalised = "N (%)",
                       H_current_smoker = "Current smoker (%)",
                       H_former_smoker = "Former smoker (%)",
                       H_current_former_smoker = "Current/former smoker (%)",
                       H_never_smoker = "Never smoker (%)",
                       H_never_unknown_smoker = "Never/unknown smoker (%)",
                       H_not_stated = "Not stated (%)") %>%
  add_header_row(top = TRUE, values = c("","Community", "Hospitalised" ), colwidths = c(2, 7, 7)) %>%
  theme_booktabs() %>%
  fix_border_issues() %>%
  set_table_properties(width = 1, layout = 'autofit')
a
saveRDS(a, file = here("reports", "tables", "table_3.rds"))
save_as_docx(a, path = here('output_data', 'Table_3.docx'))

6.3.1 Smoking and hospitalisation meta-analysis

The studies included in meta-analysis are rentsch, argenziano, yanover, hamer, miyara_medrxiv, garassino, gu, petrilli, israel_schaffer, ioannou, zhang_li, jehi, arleo, didikoglu, ho_narasimhan, saurabh, patone, wang_codd, and puebla_neira.

6.3.1.1 Current versus never smoking for hospitalisation

# Data --------------------------------------------------------------------
meta <- tibble('author' = table_3$lead_author,
               'community_smoker' = table_3$community_current_smoker,
               'community_never_smoker' = table_3$community_never_smoker, 
               'hospitalised_smoker' = table_3$hospitalised_current_smoker, 
               'hospitalised_never_smoker' = table_3$hospitalised_never_smoker,
               'community_former_smoker' = table_3$community_former_smoker,
               'hospitalised_former_smoker' = table_3$hospitalised_former_smoker) %>%
  filter(author %in% included_studies$lead_author)

meta$author <- to_upper_camel_case(meta$author, sep_out = ", ")
meta$author <- plyr::mapvalues(meta$author,
                               from = cleaned_names,
                               to = correct_names)  

# Current smoker hospitalisation ------------------------------------------
event_rates_smoker <- meta %>%
  mutate(., Ee = hospitalised_smoker) %>%
  mutate(., Ne = (hospitalised_smoker+community_smoker)) %>%
  mutate(., Ec = hospitalised_never_smoker) %>%
  mutate(., Nc = (hospitalised_never_smoker+community_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_smoker <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_smoker,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = T,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

write_rds(event_rates_smoker, here("data_clean", "bayes_hospital_current.rds"))

Current smoking against never smoking for hospitalisation

ECDF for current smoking against never smoking for hospitalisation

6.3.1.2 Former versus never smoking for hospitalisation

# Former smoker hospitalisation -------------------------------------------
event_rates_former <- meta %>%
  mutate(., Ee = hospitalised_former_smoker) %>%
  mutate(., Ne = (hospitalised_former_smoker+community_former_smoker)) %>%
  mutate(., Ec = hospitalised_never_smoker) %>%
  mutate(., Nc = (hospitalised_never_smoker+community_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_former <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_former,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = T,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

running_meta_count <- running_meta_count %>%
  full_join(., meta %>%
  rename("study" = author), by = "study") %>%
  select(study)

write_rds(event_rates_former, here("data_clean", "bayes_hospital_former.rds"))

Former smoking against never smoking for hospitalisation ECDF for Former smoking against never smoking for hospitalisation

6.4 Table 4: Smoking and COVID-19 disease severity

#Table 4
table_4_word <- table_4 %>%
  mutate(sample = sample_with_severity,
         non_severe_disease_percentage = formatC(non_severe_disease/sample*100, digits = 2, format = "f"),
         non_severe_current_smoker_percent = formatC(non_severe_current_smoker/non_severe_disease*100, digits = 2, format = "f"),
         non_severe_former_smoker_percent = formatC(non_severe_former_smoker/non_severe_disease*100, digits = 2, format = "f"),
         non_severe_current_former_smoker_percent = formatC(non_severe_current_former_smoker/non_severe_disease*100, digits = 2, format = "f"),
         non_severe_never_smoker_percent = formatC(non_severe_never_smoker/non_severe_disease*100, digits = 2, format = "f"),
         non_severe_never_unknown_smoker_percent = formatC(non_severe_never_unknown_smoker/non_severe_disease*100, digits = 2, format = "f"),
         non_severe_not_stated_percent = formatC(non_severe_not_stated/non_severe_disease*100, digits = 2, format = "f"),
         severe_disease_number_percent = formatC(severe_disease_number/sample*100, digits = 2, format = "f"),
         severe_disease_current_smoker_percent = formatC(severe_disease_current_smoker/severe_disease_number*100, digits = 2, format = "f"),
         severe_disease_former_smoker_percent = formatC(severe_disease_former_smoker/severe_disease_number*100, digits = 2, format = "f"),
         severe_disease_current_former_smoker_percent = formatC(severe_disease_current_former_smoker/severe_disease_number*100, digits = 2, format = "f"),
         severe_disease_never_smoker_percent = formatC(severe_disease_never_smoker/severe_disease_number*100, digits = 2, format = "f"),
         severe_disease_never_unknown_percent = formatC(severe_disease_never_unknown/severe_disease_number*100, digits = 2, format = "f"),
         severe_disease_not_stated_percent = formatC(severe_disease_not_stated/severe_disease_number*100, digits = 2, format = "f")) %>%
  select(lead_author, sample, non_severe_disease, non_severe_disease_percentage, non_severe_current_smoker,
         non_severe_current_smoker_percent, non_severe_former_smoker, non_severe_former_smoker_percent,
         non_severe_current_former_smoker, non_severe_current_former_smoker_percent, non_severe_never_smoker,
         non_severe_never_smoker_percent, non_severe_never_unknown_smoker, non_severe_never_unknown_smoker_percent,
         non_severe_not_stated, non_severe_not_stated_percent, severe_disease_number, severe_disease_number_percent,
         severe_disease_current_smoker, severe_disease_current_smoker_percent, 
         severe_disease_former_smoker, severe_disease_former_smoker_percent, severe_disease_current_former_smoker,
         severe_disease_current_former_smoker_percent, severe_disease_never_smoker, severe_disease_never_smoker_percent,
         severe_disease_never_unknown, severe_disease_never_unknown_percent, severe_disease_not_stated,
         severe_disease_not_stated_percent) %>%
  write_rds(here::here('data_clean', 'table_4_word.rds'))

The number of studies reporting on hospitalisation by smoking status is: 91

quality_table_4 <- table_4_word %>%
  left_join(., quality_rating, by = 'lead_author') %>%
  select(lead_author, overall_quality)

a <- table_4_word %>%
  mutate(Author = lead_author,
         Population = sample,
         non_severe_disease = paste(
           paste(non_severe_disease, 
                 as.integer(non_severe_disease_percentage), sep = " ("), "%)", sep = ""),
         ns_current_smoker = paste(paste(non_severe_current_smoker, (non_severe_current_smoker_percent), sep = " (")
                                  , "%)", sep = ""),
         ns_current_smoker = na_if(ns_current_smoker, "NA ( NA%)"),
         ns_former_smoker = paste(paste(non_severe_former_smoker, (non_severe_former_smoker_percent), sep = " (")
                                 , "%)", sep = ""),
         ns_former_smoker = na_if(ns_former_smoker, "NA ( NA%)"),
         ns_current_former_smoker = paste(paste(non_severe_current_former_smoker,
                                               (non_severe_current_former_smoker_percent), sep = " (")
                                         , "%)", sep = ""),
         ns_current_former_smoker = na_if(ns_current_former_smoker, "NA ( NA%)"),
         ns_never_smoker = paste(paste(non_severe_never_smoker,
                                      (non_severe_never_smoker_percent), sep = " (")
                                , "%)", sep = ""),
         ns_never_smoker = na_if(ns_never_smoker, "NA ( NA%)"),
         ns_never_unknown_smoker = paste(paste(non_severe_never_unknown_smoker,
                                              (non_severe_never_unknown_smoker_percent), sep = " (")
                                        , "%)", sep = ""),
         ns_never_unknown_smoker = na_if(ns_never_unknown_smoker, "NA ( NA%)"),
         ns_not_stated = paste(paste(non_severe_not_stated,
                                    (non_severe_not_stated_percent), sep = " (")
                              , "%)", sep = ""),
         ns_not_stated = na_if(ns_not_stated, "NA ( NA%)"),
         ) %>%
  mutate(severe_disease = paste(
    paste(severe_disease_number, 
          as.integer(severe_disease_number_percent), sep = " ("), "%)", sep = ""),
    s_current_smoker = paste(paste(severe_disease_current_smoker, (severe_disease_current_smoker_percent), sep = " (")
                             , "%)", sep = ""),
    s_current_smoker = na_if(s_current_smoker, "NA ( NA%)"),
    s_former_smoker = paste(paste(severe_disease_former_smoker, (severe_disease_former_smoker_percent), sep = " (")
                            , "%)", sep = ""),
    s_former_smoker = na_if(s_former_smoker, "NA ( NA%)"),
    s_current_former_smoker = paste(paste(severe_disease_current_former_smoker,
                                          (severe_disease_current_former_smoker_percent), sep = " (")
                                    , "%)", sep = ""),
    s_current_former_smoker = na_if(s_current_former_smoker, "NA ( NA%)"),
    s_never_smoker = paste(paste(severe_disease_never_smoker,
                                 (severe_disease_never_smoker_percent), sep = " (")
                           , "%)", sep = ""),
    s_never_smoker = na_if(s_never_smoker, "NA ( NA%)"),
    s_never_unknown_smoker = paste(paste(severe_disease_never_unknown,
                                         (severe_disease_never_unknown_percent), sep = " (")
                                   , "%)", sep = ""),
    s_never_unknown_smoker = na_if(s_never_unknown_smoker, "NA ( NA%)"),
    s_not_stated = paste(paste(severe_disease_not_stated,
                                 (severe_disease_not_stated_percent), sep = " (")
                           , "%)", sep = ""),
    s_not_stated = na_if(s_not_stated, "NA ( NA%)")) %>%
  select(Author, Population, non_severe_disease, ns_current_smoker, ns_former_smoker,
         ns_current_former_smoker, ns_never_smoker, ns_never_unknown_smoker, ns_not_stated,
         severe_disease, s_current_smoker, s_former_smoker, s_current_former_smoker, s_never_smoker, s_never_unknown_smoker,
         s_not_stated)

         
a$Author <- to_upper_camel_case(a$Author, sep_out = ", ")
a$Author <- plyr::mapvalues(a$Author,
                               from = cleaned_names,
                               to = correct_names)  
numeric_columns <- 'Population'

a <- flextable(a) %>%
  colformat_double(j = numeric_columns, digits = 0, na_str = '-', big.mark = ',') %>%
  colformat_char(na_str = '-')

a <- set_header_labels(a,
                       Population = "Population with severity",
                       non_severe_disease = "N (%)",
                       ns_current_smoker = "Current smoker (%)",
                       ns_former_smoker = "Former smoker (%)",
                       ns_current_former_smoker = "Current/former smoker (%)",
                       ns_never_unknown_smoker = "Never/unknown smoker (%)",
                       ns_never_smoker = "Never smoker (%)",
                       ns_not_stated = "Not stated (%)",
                       severe_disease = "N (%)", 
                       s_current_smoker = "Current smoker (%)", 
                       s_former_smoker = "Former smoker (%)", 
                       s_current_former_smoker = "Current/former smoker (%)",
                       s_never_unknown_smoker = "Never/unknown smoker (%)",
                       s_never_smoker = "Never smoker (%)",
                       s_not_stated = "Not stated (%)") %>%
  add_header_row(top = TRUE, values = c("","Non severe disease", "Severe disease" ), colwidths = c(2, 7, 7)) %>%
  theme_booktabs() %>%
  fix_border_issues() %>%
  set_table_properties(width = 1, layout = 'autofit')
a
saveRDS(a, file = here("reports", "tables", "table_4.rds"))
save_as_docx(a, path = here('output_data', 'Table_4.docx'))

6.4.1 Smoking and severity meta-analysis

The studies included in meta-analysis are guan_ni, rentsch, hadjadj, feuth, kuderer, gu, monteiro, petrilli, caliskan, ho_narasimhan, saurabh, chousein, phosp_covid_collaborative_group, faverio, jha_shrestha, rachmawati, and hojbjerg_lassen.

6.4.1.1 Current versus never smoking for risk of severe disease

# Data --------------------------------------------------------------------
meta <- tibble('author' = table_4$lead_author,
               'non_severe_smoker' = table_4$non_severe_current_smoker,
               'non_severe_never_smoker' = table_4$non_severe_never_smoker,
               'severe_smoker' = table_4$severe_disease_current_smoker,
               'severe_never_smoker' = table_4$severe_disease_never_smoker,
               'non_severe_former_smoker' = table_4$non_severe_former_smoker,
               'severe_former_smoker' = table_4$severe_disease_former_smoker) %>%
  filter(author %in% included_studies$lead_author) %>%
  filter(author != "hippisley_cox")

meta$author <- to_upper_camel_case(meta$author, sep_out = ", ")
meta$author <- plyr::mapvalues(meta$author,
                               from = cleaned_names,
                               to = correct_names)  

# Current smoker severity ------------------------------------------
event_rates_smoker <- meta %>%
  mutate(., Ee = severe_smoker) %>%
  mutate(., Ne = (severe_smoker+non_severe_smoker)) %>%
  mutate(., Ec = severe_never_smoker) %>%
  mutate(., Nc = (severe_never_smoker+non_severe_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_smoker <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_smoker,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = F,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

running_meta_count <- running_meta_count %>%
  full_join(., meta %>%
  rename("study" = author), by = "study") %>%
  select(study)

write_rds(event_rates_smoker, here("data_clean", "bayes_severity_current.rds"))

Current smoking against never smoking for disease severity ECDF for current smoking against never smoking for disease severity

6.4.1.2 Former versus never smoking for risk of severe disease

# Former smoker severity -------------------------------------------
event_rates_former <- meta %>%
  mutate(., Ee = severe_former_smoker) %>%
  mutate(., Ne = (severe_former_smoker+non_severe_former_smoker)) %>%
  mutate(., Ec = severe_never_smoker) %>%
  mutate(., Nc = (severe_never_smoker+non_severe_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_former <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_former,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = F,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

write_rds(event_rates_former, here("data_clean", "bayes_severity_former.rds"))

Former smoking against never smoking for disease severity ECDF for former smoking against never smoking for disease severity

6.5 Table 5: Smoking and COVID-19 mortality

#Table 5
table_5_word <- table_5 %>%
  mutate(sample = sample_with_deaths,
         deaths_percentage = formatC(deaths/sample*100, digits = 2, format = "f"),
         death_current_smokers_percent = formatC(death_current_smokers/deaths*100, digits = 2, format = "f"),
         death_former_smokers_percent = formatC(death_former_smokers/deaths*100, digits = 2, format = "f"),
         death_current_former_smokers_percent = formatC(death_current_former_smokers/deaths*100, digits = 2, format = "f"),
         death_never_smokers_percent = formatC(death_never_smokers/deaths*100, digits = 2, format = "f"),
         death_never_unknown_smokers_percent = formatC(death_never_unknown_smokers/deaths*100, digits = 2, format = "f"),
         death_not_stated_percent = formatC(death_not_stated/deaths*100, digits = 2, format = "f"),
         recovered_percentage = formatC(recovered/sample*100, digits = 2, format = "f"),
         recovered_current_smokers_percent = formatC(recovered_current_smoking/recovered*100, digits = 2, format = "f"),
         recovered_former_smokers_percent = formatC(recovered_former_smoker/recovered*100, digits = 2, format = "f"),
         recovered_current_former_smokers_percent = formatC(recovered_current_former_smokers/recovered*100, digits = 2, format = "f"),
         recovered_never_smokers_percent = formatC(recovered_never_smoker/recovered*100, digits = 2, format = "f"),
         recovered_never_unknown_smokers_percent = formatC(recovered_never_unknown_smoker/recovered*100, digits = 2, format = "f"),
         recovered_not_stated_percent = formatC(recovered_not_stated/recovered*100, digits = 2, format = "f")) %>%
  select(lead_author, sample, recovered, recovered_percentage, recovered_current_smoking, recovered_current_smokers_percent,
         recovered_former_smoker, recovered_former_smokers_percent, recovered_current_former_smokers,
         recovered_current_former_smokers_percent, recovered_never_smoker, recovered_never_smokers_percent,
         recovered_never_unknown_smoker, recovered_never_unknown_smokers_percent, recovered_not_stated,
         recovered_not_stated_percent,
         deaths, deaths_percentage, death_current_smokers, death_current_smokers_percent,
         death_former_smokers, death_former_smokers_percent, death_current_former_smokers,
         death_current_former_smokers_percent, death_never_smokers, death_never_smokers_percent,
         death_never_unknown_smokers, death_never_unknown_smokers_percent, death_not_stated, death_not_stated_percent) %>%
  write_rds(here::here('data_clean', 'table_5_word.rds'))

The number of studies reporting on death by smoking status is: 116

quality_table_5 <- table_5_word %>%
  left_join(., quality_rating, by = 'lead_author') %>%
  select(lead_author, overall_quality)

fair_quality_5 <- quality_table_5 %>%
  filter(lead_author %in% included_studies$lead_author) %>%
  filter(overall_quality != 'fair')

a <- table_5_word %>%
  mutate(Author = lead_author,
         Population = sample,
         recovered = paste(
           paste(recovered, 
                 as.integer(recovered_percentage), sep = " ("), "%)", sep = ""),
         recovered_current_smoker = paste(paste(recovered_current_smoking, 
                                                 (recovered_current_smokers_percent), sep = " (")
                                  , "%)", sep = ""),
         recovered_current_smoker = na_if(recovered_current_smoker, "NA ( NA%)"),
         recovered_former_smoker = paste(paste(recovered_former_smoker,
                                               (recovered_former_smokers_percent), sep = " (")
                                 , "%)", sep = ""),
         recovered_former_smoker = na_if(recovered_former_smoker, "NA ( NA%)"),
         recovered_current_former_smoker = paste(paste(recovered_current_former_smokers,
                                               (recovered_current_former_smokers_percent), sep = " (")
                                         , "%)", sep = ""),
         recovered_current_former_smoker = na_if(recovered_current_former_smoker, "NA ( NA%)"),
         recovered_never_smoker = paste(paste(recovered_never_smoker,
                                      (recovered_never_smokers_percent), sep = " (")
                                , "%)", sep = ""),
         recovered_never_smoker = na_if(recovered_never_smoker, "NA ( NA%)"),
         recovered_never_unknown_smoker = paste(paste(recovered_never_unknown_smoker,
                                              (recovered_never_unknown_smokers_percent), sep = " (")
                                        , "%)", sep = ""),
         recovered_never_unknown_smoker = na_if(recovered_never_unknown_smoker, "NA ( NA%)"),
         recovered_not_stated = paste(paste(recovered_not_stated,
                                    (recovered_not_stated_percent), sep = " (")
                              , "%)", sep = ""),
         recovered_not_stated = na_if(recovered_not_stated, "NA ( NA%)")) %>%
  mutate(deaths = paste(
           paste(deaths, 
                 as.integer(deaths_percentage), sep = " ("), "%)", sep = ""),
         death_current_smoker = paste(paste(death_current_smokers, 
                                                 (death_current_smokers_percent), sep = " (")
                                  , "%)", sep = ""),
         death_current_smoker = na_if(death_current_smoker, "NA ( NA%)"),
         death_former_smoker = paste(paste(death_former_smokers,
                                               (death_former_smokers_percent), sep = " (")
                                 , "%)", sep = ""),
         death_former_smoker = na_if(death_former_smoker, "NA ( NA%)"),
         death_current_former_smoker = paste(paste(death_current_former_smokers,
                                               (death_current_former_smokers_percent), sep = " (")
                                         , "%)", sep = ""),
         death_current_former_smoker = na_if(death_current_former_smoker, "NA ( NA%)"),
         death_never_smoker = paste(paste(death_never_smokers,
                                      (death_never_smokers_percent), sep = " (")
                                , "%)", sep = ""),
         death_never_smoker = na_if(death_never_smoker, "NA ( NA%)"),
         death_never_unknown_smoker = paste(paste(death_never_unknown_smokers,
                                              (death_never_unknown_smokers_percent), sep = " (")
                                        , "%)", sep = ""),
         death_never_unknown_smoker = na_if(death_never_unknown_smoker, "NA ( NA%)"),
         death_not_stated = paste(paste(death_not_stated,
                                    (death_not_stated_percent), sep = " (")
                              , "%)", sep = ""),
         death_not_stated = na_if(death_not_stated, "NA ( NA%)")) %>%
  select(Author, Population, recovered, recovered_current_smoker, recovered_former_smoker, recovered_current_former_smoker,
         recovered_never_smoker, recovered_never_unknown_smoker, recovered_not_stated, deaths, death_current_smoker,
         death_former_smoker, death_current_former_smoker, death_never_smoker, death_never_unknown_smoker,
         death_not_stated)

         
a$Author <- to_upper_camel_case(a$Author, sep_out = ", ")
a$Author <- plyr::mapvalues(a$Author,
                               from = cleaned_names,
                               to = correct_names)  

numeric_columns <- 'Population'

a <- flextable(a) %>%
  colformat_double(j = numeric_columns, digits = 0, na_str = '-', big.mark = ',') %>%
  colformat_char(na_str = '-')

a <- set_header_labels(a,
                       Population = "Population with mortality",
                       recovered = "N (%)",
                       recovered_current_smoker = "Current smoker (%)",
                       recovered_former_smoker = "Former smoker (%)",
                       recovered_current_former_smoker = "Current/former smoker (%)",
                       recovered_never_unknown_smoker = "Never/unknown smoker (%)",
                       recovered_never_smoker = "Never smoker (%)",
                       recovered_not_stated = "Not stated (%)",
                       deaths = "N (%)", 
                       death_current_smoker = "Current smoker (%)", 
                       death_former_smoker = "Former smoker (%)", 
                       death_current_former_smoker = "Current/former smoker (%)",
                       death_never_unknown_smoker = "Never/unknown smoker (%)",
                       death_never_smoker = "Never smoker (%)",
                       death_not_stated = "Not stated (%)") %>%
  add_header_row(top = TRUE, values = c("","Recovered", "Died" ), colwidths = c(2, 7, 7)) %>%
  theme_booktabs() %>%
  fix_border_issues() %>%
  set_table_properties(width = 1, layout = 'autofit')
a
saveRDS(a, file = here("reports", "tables", "table_5.rds"))
save_as_docx(a, path = here('output_data', 'Table_5.docx'))

6.5.1 Smoking and COVID-19 mortality meta-analysis

The studies included in meta-analysis are al_hindawi, kuderer, miyara_medrxiv, chen_yu, gu, thompson, bernaola, philipose, fillmore, rubio_rivas, ullah, ioannou, lamure, zinellu, zhang_li, adrish, sourij, simons, caliskan, ho_narasimhan, saurabh, dambha_miller, williamson_tazare, dayem_ullah, greenbaum, madakkatel, demichev, puebla_neira, prats_uribe_xie, bruce, bhaskaran, and aboueshia.

6.5.1.1 Current versus never smoking for disease mortality

# Data --------------------------------------------------------------------
meta <- tibble('author' = table_5$lead_author,
               'recovered_current_smoker' = table_5$recovered_current_smoking,
               'recovered_never_smoker' = table_5$recovered_never_smoker,
               'death_current_smoker' = table_5$death_current_smokers,
               'death_never_smoker' = table_5$death_never_smokers,
               'recovered_former_smoker' = table_5$recovered_former_smoker,
               'death_former_smoker' = table_5$death_former_smokers) %>%
  filter(author %in% included_studies$lead_author)

meta$author <- to_upper_camel_case(meta$author, sep_out = ", ")
meta$author <- plyr::mapvalues(meta$author,
                               from = cleaned_names,
                               to = correct_names)  


# Current smoker mortality ------------------------------------------
event_rates_smoker <- meta %>%
  mutate(., Ee = death_current_smoker) %>%
  mutate(., Ne = (death_current_smoker+recovered_current_smoker)) %>%
  mutate(., Ec = death_never_smoker) %>%
  mutate(., Nc = (death_never_smoker+recovered_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_smoker <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_smoker,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = F,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

write_rds(event_rates_smoker, here("data_clean", "bayes_mortality_current.rds"))

Current smoking against never smoking for mortality ECDF for current smoking against never smoking for mortality

6.5.1.2 Former versus never smoking for disease mortality

# Former smoker mortality ------------------------------------------
event_rates_former <- meta %>%
  mutate(., Ee = death_former_smoker) %>%
  mutate(., Ne = (death_former_smoker+recovered_former_smoker)) %>%
  mutate(., Ec = death_never_smoker) %>%
  mutate(., Nc = (death_never_smoker+recovered_never_smoker)) %>%
  rename('Author' = author) %>%
  select(Author, Ee, Ne, Ec, Nc)

event_rates_former <- metabin(Ee,
                              Ne,
                              Ec,
                              Nc,
                              data = event_rates_former,
                              studlab = paste(Author),
                              comb.fixed = F,
                              comb.random = T,
                              method.tau = 'SJ',
                              hakn = F,
                              prediction = F,
                              incr = 0.1,
                              sm = 'RR')

write_rds(event_rates_former, here("data_clean", "bayes_mortality_former.rds"))

running_meta_count <- running_meta_count %>%
  full_join(., meta %>%
  rename("study" = author), by = "study") %>%
  select(study)

Former smoking against never smoking for mortality ECDF for former smoking against never smoking for mortality

6.6 Study quality assessment

#Table 6
good <- filter(quality_rating, quality_rating$overall_quality == "good" ) %>%
  filter(!lead_author %in% exclude_from_analysis) %>%
  nrow()
fair <- filter(quality_rating, quality_rating$overall_quality == "fair" ) %>%
  filter(!lead_author %in% exclude_from_analysis) %>%
  nrow()
poor <- filter(quality_rating, quality_rating$overall_quality == "poor" ) %>%
  filter(!lead_author %in% exclude_from_analysis) %>%
  nrow()

The number of Good quality studies are 11, Fair quality studies are 129 with 406 Poor quality studies

The number of studies included in Meta-analysis are 87

7 Final PRISMA table

prisma <- prisma %>%
  mutate("Studies included in meta-analysis" = nrow(running_meta_count))
flextable(prisma)

8 Supplementary tables

8.1 Supplementary tables

8.1.1 Data on clinical or virological diagnosis and studies stratifying smoking status

a <- data_study_general %>%
  filter(review_version %in% analysed_versions) %>%
  select(lead_author, study_design, clinical_diagnosis, stratified_smoking) %>%
  filter(!lead_author %in% exclude_from_analysis) %>%
  rename("Lead author" = "lead_author",
         "Study design" = "study_design",
         "Clinical diagnoses included" = "clinical_diagnosis",
         "Stratified smoking reported" = "stratified_smoking") %>%
 mutate("Lead author" = to_upper_camel_case(`Lead author`, sep_out = ", "),
        "Lead author" = plyr::mapvalues(`Lead author`,
                               from = cleaned_names,
                               to = correct_names),
        "Study design" = to_upper_camel_case(`Study design`, sep_out = " "))

a <- a %>%
  flextable() %>%
    set_caption(caption = 'Studies design, inclusion of clinical diagnoses and stratified smoking') %>%
  set_table_properties(width = 1, layout = 'autofit')

save_as_docx(a, path = here('output_data', 'Supplementary_2.docx'))

8.1.2 Country level smoking estimates

country_smoking <-  readxl::read_xlsx(here("data_spreadsheet", "data_extraction_current.xlsx"), sheet = 'national_smoking_prevalence') %>%
  flextable()
save_as_docx(country_smoking, path = here('output_data', 'Supplementary_3.docx'))

8.1.3 Sensitivity